Abstract

We study deterministic extractors for oblivious bit-fixing sources (a.k.a.
resilient functions) and exposure-resilient functions with small min-entropy:
of the function's n input bits, k ≪ n bits are uniformly random and
unknown to the adversary.

We simplify and improve an explicit construction of extractors for 
bit-fixing sources with sublogarithmic k due to Kamp and Zuckerman 
(SICOMP 2006), achieving error exponentially small in k rather than 
polynomially small in k. Our main result is that when k is sublogarithmic 
in n, the short output length of this construction (O(log k) output bits)
is optimal for extractors computable by a large class of space-bounded 
streaming algorithms. 

Next, we show that a random function is an extractor for oblivious bit- 
fixing sources with high probability if and only if k is superlogarithmic in 
n, suggesting that our main result may apply more generally. In contrast, 
we show that a random function is a static (resp. adaptive) exposure- 
resilient function with high probability even if k is as small as a constant 
(resp. log log n). No explicit exposure-resilient functions achieving these 
parameters are known. 

Keywords: pseudorandomness, exposure-resilient function, randomness
extractor, bit-fixing source

1 Introduction 

Randomness extractors are functions that extract almost-uniform bits from 
weak sources of randomness (which may have biases and/or correlations). Ex- 
tractors can be used for simulating randomized algorithms and protocols with 
weak sources of randomness, have close connections to many other "pseudoran- 
dom objects" (such as expander graphs and error-correcting codes), and have a 
variety of other applications in theoretical computer science. 

*Some of these results previously appeared in the first author's undergraduate thesis [Res].

The most extensively studied type of extractor is the seeded extractor, introduced
by Nisan and Zuckerman [NZ]. These extractors are given as additional
input a small "seed" of truly random bits to use as a catalyst for the random- 
ness extraction, and this allows for extracting almost-uniform bits from very 
unstructured sources, where all we know is a lower bound on the min-entropy. 
In many applications, such as randomized algorithms, the need for truly ran- 
dom bits can be eliminated by trying all possible seeds and combining the results 
(e.g. by majority vote). However, prior to the Nisan-Zuckerman notion, there 
was a substantial interest in deterministic extractors (which have no random 
seed) for restricted classes of sources. Over the past decade, there has been a 
resurgence in the study of deterministic extractors, motivated by settings where 
enumerating all possible seeds does not work (e.g. distributed protocols) and 
by other applications in cryptography. 

In this paper, we study one of the most basic models: an oblivious bit-fixing 
source (OBFS) is an n-bit source where some n − k bits are fixed arbitrarily
and the remaining k bits are uniformly random. Deterministic extractors for 
OBFSs, also known as resilient functions (RFs), were first studied in the mid- 
80's, motivated by cryptographic applications [Vaz, BBR, CGH+]. A more 
relaxed notion is that of an exposure-resilient function (ERF), introduced in 
2000 by Canetti et al. [CDH+]. Here all n bits of the source are chosen uniformly 
at random, but n − k of them are seen by an adversary; an ERF should extract
bits that are almost-uniform even conditioned on what the adversary sees. ERFs
come in two types: static ERFs, where the adversary decides which n − k bits
to see in advance, and adaptive ERFs, where the adversary reads the n − k bits
adaptively. In recent years, there has been substantial progress in giving explicit
constructions of both RFs and ERFs [CDH+, DSS, KZ, GRS]. 

In this paper, we focus on the case when k, the number of random bits 
unknown to the adversary, is very small, e.g. k < log n. While this case is not 
directly motivated by applications, it is interesting from a theoretical perspective 
for a couple of reasons: 

• For many other natural classes of sources (several independent sources [CG],
samplable sources [TV], and affine sources [BKS+]), at least logarithmic
min-entropy is necessary for extraction.¹

• This is a rare case where a random function is not an optimal extractor. 
For example, the parity function extracts one completely unbiased bit from 
any bit-fixing source with k = 1 random bits, but we show that a random 
function will fail to extract from some such source with high probability. 

Our first results concern explicit constructions of extractors for OBFS with 
k sublogarithmic in n. 

• We simplify and improve an explicit construction of extractors for OBFSs
with small k by Kamp and Zuckerman [KZ]. In particular, the error
parameter of our construction can be exponentially small in k, whereas the
Kamp-Zuckerman construction achieves error that is polynomially small
in k. Our extractor (like that of [KZ]) extracts only O(log k) almost-uniform
bits, in contrast to extractors for superlogarithmic k, which can
extract nearly k bits.

¹ For the case of 2 independent sources, the need for logarithmic min-entropy is proven in
[CG]. For sources samplable by circuits of size $s = n^2$, it can be shown by noting that the
uniform distribution on any $2^k$ elements of $\{0,1\}^{k+1} \circ 0^{n-k-1}$ is samplable by a circuit of
size $O(n \cdot 2^k)$ (and we can pick $2^k$ elements on which the first bit of the extractor is constant).
For affine sources, it can be shown by analyzing the k-th Gowers norm of the set of inputs on
which the first bit of the extractor is constant (as pointed out to us by Ben Green).

• We prove that, when k is sublogarithmic, the O(log k) output length of
our extractor is optimal for extractors for OBFSs computable by
space-bounded streaming algorithms with a certain "forgetlessness" property.
The class of streaming algorithms we analyze includes our construction as
well as many natural random-walk based constructions. This is our main
result. 

Next, we investigate properties of random functions as extractors for OBFSs
and find that k ≈ log n appears to be a critical point for extractors for OBFSs
in this setting as well. Specifically, we show that:

• A random function is an extractor for OBFSs (with high probability) if
and only if k is at least roughly log n.

• In contrast, for the more relaxed concept of exposure-resilient functions, 
random functions suffice even for sublogarithmic k. For static ERFs, k 
can be as small as a constant, and for adaptive ERFs, k can be as small 
as log log n. 

All of the results concerning random functions yield resilient/exposure-resilient 
functions that output nearly k almost-uniform bits. 

2 Preliminaries 

Throughout, we will use the convention that a lowercase number (e.g. n)
implicitly defines a corresponding capital number (N) as its exponentiation with
base 2 (i.e. $N = 2^n$).

Definition 2.1 (Statistical Distance). Let X and Y be two random variables
taking values in a set S. The statistical distance $\Delta(X, Y)$ between X and Y is

$$\Delta(X, Y) = \max_{T \subseteq S} \left| \Pr[X \in T] - \Pr[Y \in T] \right| = \frac{1}{2} \sum_{w \in S} \left| \Pr[X = w] - \Pr[Y = w] \right|.$$

We will write $X \approx_\varepsilon Y$ to mean $\Delta(X, Y) \le \varepsilon$, and we will use $U_n$ to denote
the uniform distribution on $\{0,1\}^n$. When $U_n$ appears twice in the same set
of parentheses, it will denote the same random variable. For example, a string
chosen from the distribution $(U_n, U_n)$ will always be of the form $w \circ w$ for some
$w \in \{0,1\}^n$. Note that $(U_n, U_m)$ still equals $U_{n+m}$.
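
The two expressions for statistical distance in Definition 2.1 can be compared directly on small examples. The following is a minimal sketch, not part of the original text: the function names and the example distributions `biased` and `uniform` are hypothetical, and the maximum over tests is computed by brute force, so it is only usable for tiny supports.

```python
from fractions import Fraction
from itertools import chain, combinations

def statistical_distance(p, q):
    """Half the L1 distance between two distributions (dicts outcome -> probability)."""
    support = set(p) | set(q)
    return sum(abs(p.get(w, 0) - q.get(w, 0)) for w in support) / 2

def max_over_tests(p, q):
    """The same quantity as a maximum over all tests T (exponential in the support size)."""
    support = sorted(set(p) | set(q))
    tests = chain.from_iterable(combinations(support, r) for r in range(len(support) + 1))
    return max(abs(sum(p.get(w, 0) for w in T) - sum(q.get(w, 0) for w in T)) for T in tests)

# Hypothetical example: a distribution on 2-bit strings that never outputs "11".
biased = {"00": Fraction(1, 2), "01": Fraction(1, 4), "10": Fraction(1, 4)}
uniform = {w: Fraction(1, 4) for w in ("00", "01", "10", "11")}
```

On this example both functions return 1/4, attained for instance by the test T = {"00"}.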

Definition 2.2 (Oblivious Symbol-Fixing Source). An (n, k, d) oblivious
symbol-fixing source (OSFS) X is a source consisting of n symbols, each drawn from [d],
of which all but k are fixed and the rest are chosen independently and uniformly
at random.

Definition 2.3 (Oblivious Bit-Fixing Source). An (n, k) oblivious bit-fixing
source (OBFS) is an (n, k, 2) oblivious symbol-fixing source.

We will use $\left\{{[n] \atop \ell}\right\}$ to denote the set $\{L \subseteq [n] : |L| = \ell\}$ and, given some
$L \in \left\{{[n] \atop \ell}\right\}$ and a string $a \in \{0,1\}^\ell$, we will write $L^{a,n}$ to denote the oblivious
bit-fixing source that has the bits with positions in L fixed to the string a.

Definition 2.4 (Deterministic Randomness Extractor). Let C be a class of
sources on $\{0,1\}^n$. A deterministic ε-extractor for C is a function $E: \{0,1\}^n \to \{0,1\}^m$
such that for every $X \in C$ we have $E(X) \approx_\varepsilon U_m$.

Here we will focus mainly on deterministic randomness extractors for obliv- 
ious bit-fixing sources, also known as resilient functions (RFs). 

Definition 2.5 (Resilient Function). A (k, ε)-RF is a function $f: \{0,1\}^n \to \{0,1\}^m$
that is a deterministic ε-extractor for (n, k) oblivious bit-fixing sources.

We can also characterize extractors for OBFSs by their ability to fool a
distinguisher: consider a computationally unbounded adversary A that can set
some of f's input bits in advance but must allow the rest to be chosen uniformly
at random. Then f satisfies Definition 2.5 if and only if A is unable to distinguish
between f's output and the uniform distribution regardless of how A changes
f's input.

When viewed through this lens, the notion of deterministic extraction from
OBFSs has a natural relaxation obtained by restricting A to only read (rather
than modify) a portion of f's input bits. Functions that are able to fool
adversaries of this type are called exposure-resilient functions (ERFs). We define
below the two simplest variants of exposure-resilient functions, which correspond
to whether A reads the bits of f's input all at once or one at a time.

Definition 2.6 (Static Exposure-Resilient Function). A static (k, ε)-ERF is a
function $f: \{0,1\}^n \to \{0,1\}^m$ with the property that for every $L \in \left\{{[n] \atop n-k}\right\}$, f
satisfies $(U_n|_L, f(U_n)) \approx_\varepsilon (U_n|_L, U_m)$.

This definition can be restated in terms of average-case extraction using the
following lemma, whose proof can be found in [Res].

Lemma 2.7. A function $f: \{0,1\}^n \to \{0,1\}^m$ is a static (k, ε)-ERF if and only
if for every $L \in \left\{{[n] \atop n-k}\right\}$, f satisfies

$$\mathop{\mathbb{E}}_{a \leftarrow U_{n-k}} \left[ \Delta(f(L^{a,n}), U_m) \right] \le \varepsilon.$$
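
For very small n, the averaged condition of Lemma 2.7 can be evaluated exhaustively. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, outputs of f are encoded as integers in range(2**m), and the running time is exponential in n.

```python
from fractions import Fraction
from itertools import combinations, product

def distance_from_uniform(outputs, m):
    """Distance between a uniformly chosen element of `outputs` and U_m."""
    counts = {}
    for y in outputs:
        counts[y] = counts.get(y, 0) + 1
    gaps = (abs(Fraction(counts.get(y, 0), len(outputs)) - Fraction(1, 2**m))
            for y in range(2**m))
    return sum(gaps) / 2

def average_error(f, n, m, exposed):
    """Left-hand side of Lemma 2.7 for one set `exposed` of n - k positions."""
    free = [i for i in range(n) if i not in exposed]
    total = Fraction(0)
    for a in product((0, 1), repeat=len(exposed)):
        outputs = []
        for r in product((0, 1), repeat=len(free)):
            w = [0] * n
            for pos, bit in zip(exposed, a):
                w[pos] = bit
            for pos, bit in zip(free, r):
                w[pos] = bit
            outputs.append(f(w))
        total += distance_from_uniform(outputs, m)
    return total / 2**len(exposed)

def static_erf_error(f, n, m, k):
    """Smallest eps for which f is a static (k, eps)-ERF, by brute force."""
    return max(average_error(f, n, m, L) for L in combinations(range(n), n - k))

def parity(w):
    return sum(w) % 2

def majority(w):
    return int(sum(w) >= 2)
```

With these definitions, `static_erf_error(parity, 4, 1, 1)` is 0, while `static_erf_error(majority, 3, 1, 1)` is 1/4: when the two exposed bits of the 3-bit majority agree, the output is already determined.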

Allowing the adversary to adaptively request bits of f's input one at a time
gives rise to the strictly stronger notion of an adaptive ERF:

Definition 2.8 (Adaptive Exposure-Resilient Function). An adaptive (k, ε)-ERF
is a function $f: \{0,1\}^n \to \{0,1\}^m$ with the property that for every
algorithm $A: \{0,1\}^n \to \{0,1\}^*$ that can (adaptively) read at most n − k bits of its
input,² f satisfies $(A(U_n), f(U_n)) \approx_\varepsilon (A(U_n), U_m)$.

The following lemma will allow us to restrict our attention to algorithms A 
that simply output the values of the bits that they request as they receive them 
(rather than outputting some function of those bits). 

² In other words, A is a binary decision tree of depth n − k − 1 with leaves labelled by its
output strings and each internal node labelled by the position of the bit that A requests at
that juncture.

Lemma 2.9. Let $A: \{0,1\}^n \to \{0,1\}^*$ be an adaptive adversary that reads
at most d bits of its input and let $A_r: \{0,1\}^n \to \{0,1\}^*$ be the algorithm
that adaptively reads the same bits as A and outputs them in the order that
they were read. For every function $f: \{0,1\}^n \to \{0,1\}^m$, the statistical distance
between $(A(U_n), f(U_n))$ and $(A(U_n), U_m)$ is at most the distance between
$(A_r(U_n), f(U_n))$ and $(A_r(U_n), U_m)$.

Proof. First, modify $A_r$ by padding its output with 0's so that its output length
is always d. Now define a second algorithm $A_p: \{0,1\}^d \to \{0,1\}^*$ as follows: on
an input $x \in \{0,1\}^d$, $A_p$ runs A, sequentially feeding it the bits of x in response
to A's requests, and then outputs A's output. The fact that $A = A_p \circ A_r$ then
implies the desired result, since applying the same function to the first component
of both pairs cannot increase their statistical distance. □

3 A simplification and a lower bound 

In this section, we prove that when the entropy parameter k is sublogarithmic 
in the input length n, an output length of O(log k) is optimal for a natural class
of space-bounded streaming algorithms, including algorithms that use the input 
bits to conduct a random walk on a graph. Before we state this lower bound, 
we give a simple improvement on the state of the art in explicit constructions 
of extractors for oblivious bit-fixing sources (i.e. resilient functions) for sublog- 
arithmic entropy. Our lower bound then shows that the parameters achieved by 
this construction are optimal. 

3.1 The simplification 

We start with a simplification of a previous construction due to [KZ]. The
previous construction is based on very good extractors for oblivious symbol- 
fixing sources with d ≥ 3 symbols obtained by using the symbols of the input
string to take a random walk on an expander graph of degree d. Since expander 
graphs do not exist with degree d = 2, this approach could not be used for 
oblivious bit-fixing sources. However, the construction of [KZ] uses the fact 
that while a random walk on an expander is not an option, a random walk on 
a cycle still extracts some randomness even when the entropy k of the input is 
very small. Our construction is a slight modification of this random walk that 
simplifies the argument and improves the error parameter. 

Theorem 3.1. For every $n \in \mathbb{N}$, $k \in [n]$, $\varepsilon > 0$, and $m = \frac{1}{2}(\log k - \log\log(1/\varepsilon))$,
the function $f: \{0,1\}^n \to \{0,1\}^m$ defined by

$$f(w) = \sum_{i=1}^{n} w_i \pmod{2^m}$$

is a $(k, \varepsilon)$-RF. In particular, setting $\varepsilon = 2^{-\sqrt{k}}$ gives output length $m = \frac{1}{4}\log k$.

Proof. We can treat f as computing the endpoint of a walk on $\mathbb{Z}/M\mathbb{Z}$ (where
$M = 2^m$) that starts at 0 and either adds 1 or 0 to its state with every bit that
it reads. Since the endpoint of this walk does not depend on the order in which
the input bits are processed, we may assume without loss of generality that all
of the fixed bits in f's input come at the beginning. These bits only change the
starting vertex of the random walk and do not affect the distance from uniform
of the resulting distribution. Therefore, to bound the distance from uniform of
any distribution of the form $f(L^{a,n})$ we need only bound the mixing time of a
walk on $\mathbb{Z}/M\mathbb{Z}$ consisting of k random steps. The following claim, whose proof
we defer to the appendix, accomplishes this.

Claim 3.2. Let $W_k$ be the distribution on the vertices of $\mathbb{Z}/M\mathbb{Z}$ (where $M = 2^m$)
obtained by beginning at 0 and adding 1 or 0 with equal probability k times. The
distance from uniform of $W_k$ is at most

$$\frac{e^{-k\pi^2/2M^2}}{2\left(1 - e^{-3k\pi^2/2M^2}\right)}.$$

Since $k \ge M^2$, the denominator of the fraction in Claim 3.2 is bounded from
below by $2(1 - e^{-3\pi^2/2}) > 1$ and so we have bounded the distance from uniform
by $e^{-k\pi^2/2M^2}$. With our setting of parameters this is at most $\varepsilon^{\log(e)\pi^2/2} < \varepsilon$, as
desired. □
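
For concrete parameters, the distance bounded in this proof can be computed exactly, since the output of f on an OBFS is a shift of a Binomial(k, 1/2) variable reduced modulo M. The following is a minimal sketch with hypothetical helper names; `claim_bound` simply evaluates the expression of Claim 3.2 in floating point.

```python
from fractions import Fraction
from math import comb, exp, pi

def walk_distance(k, m):
    """Exact distance from uniform of (sum of k uniform bits) mod 2^m."""
    M = 2**m
    counts = [0] * M
    for j in range(k + 1):
        counts[j % M] += comb(k, j)   # number of k-bit strings with weight j
    return sum(abs(Fraction(c, 2**k) - Fraction(1, M)) for c in counts) / 2

def claim_bound(k, m):
    """The bound stated in Claim 3.2, with M^2 = 4^m."""
    x = k * pi**2 / (2 * 4**m)
    return exp(-x) / (2 * (1 - exp(-3 * x)))
```

For example, with k = 64 and ε = 1/16 the theorem gives m = (log 64 − log log 16)/2 = 2, and `walk_distance(64, 2)` is exactly 2^−33, well below both ε and `claim_bound(64, 2)`. With m = 1 the function is parity and the distance is 0 for every k ≥ 1.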

The difference between this construction and that of [KZ] is that each step
of the random walk carried out by f consists of adding either 1 or 0 rather than
1 or −1 to the current state. This has two advantages. First, the random walk
in the construction of [KZ] cannot be carried out on a graph of size $2^m$ since any
even-sized cycle is bipartite and the walk traverses an edge at each step. This
necessitates an additional lemma about converting the output of the random
walk to one that is almost uniformly distributed over $\{0,1\}^m$, which incurs
an error polynomially related to k.³ By eliminating the need for this lemma,
the construction of Theorem 3.1 manages to achieve an exponentially small
error parameter. Second, setting m = 1 in the construction of Theorem 3.1
makes it clear that the idea underlying both it and the [KZ] construction is
simply a generalization of bitwise addition modulo 2 (the parity function),
which extracts 1 uniformly random bit whenever k ≥ 1.

As discussed previously, this construction achieves output length only loga- 
rithmic in k. This is considerably worse than the output length of $k - 2\log(1/\varepsilon) - O(1)$
which we show to be possible both for extractors for OBFSs with k > log n
(Section 4.1) and for ERFs (Section 4.2). The lower bound we prove in the fol- 
lowing section shows why this is the case. 

3.2 The lower bound 

The extractor of Theorem 3.1 is a symmetric function; that is, its output is not 
sensitive to the order in which the input bits are arranged. We begin building 
our more general negative result by first showing that extractors for OBFSs 
with this property cannot have superlogarithmic output length. 

Lemma 3.3. Suppose that $X = L^{a,n}$ is an (n, k)-OBFS and that $f: \{0,1\}^n \to \{0,1\}^m$
is a symmetric function of the input bits in [n] − L. (That is, for every
permutation $\pi: [n] \to [n]$ that fixes L, $f(x_{\pi(1)}, \ldots, x_{\pi(n)}) = f(x_1, \ldots, x_n)$.)
Then $f(X) \approx_\varepsilon U_m$ implies that $m \le \log(k/(1 - \varepsilon))$.

³ This additional error was overlooked in [KZ], and their Theorem 1.2 erroneously claims
an error exponentially small in k. 

Proof. By the symmetry of f on the bits in [n] − L, the size of the support
of f(X) is at most k. (The output depends only on the number of input bits
in [n] − L that equal 1.) Thus, the distance between f(X) and $U_m$ is at least
$(M - k)/M$. Together with $f(X) \approx_\varepsilon U_m$, this implies that $\varepsilon \ge (M - k)/M$,
which is equivalent to $m \le \log(k/(1 - \varepsilon))$. □

We can use Lemma 3.3 to show that no symmetric function with large output 
length can be even a static ERF. 

Proposition 3.4. If a symmetric function $f: \{0,1\}^n \to \{0,1\}^m$ is a static
$(k, \varepsilon)$-ERF then $m \le \log(k/(1 - \varepsilon))$.

Proof. From Lemma 2.7, we have that for f to be a static ERF, it must satisfy,
for all sets $L \in \left\{{[n] \atop n-k}\right\}$,

$$\mathop{\mathbb{E}}_{a \leftarrow U_{n-k}} \left[ \Delta(f(L^{a,n}), U_m) \right] \le \varepsilon.$$

It follows by averaging that there exists a set L and a string a such that
$f(L^{a,n}) \approx_\varepsilon U_m$. Application of Lemma 3.3 to the source $L^{a,n}$ then yields the
result. □

Since every deterministic ε-extractor for (n, k)-OBFSs is a static (k, ε)-ERF
and every adaptive (k, e)-ERF is also a static (k, e)-ERF, Proposition 3.4 applies 
to extractors for OBFSs and adaptive ERFs as well. Thus, Proposition 3.4 shows 
that constructions like that of Theorem 3.1 and that of [KZ] are optimal. 

However, there are many natural candidates for extraction from OBFSs that 
are similar to that of Theorem 3.1 but are not symmetric, such as the analogous 
random walk on a directed version of a 3-regular or 4-regular expander graph. 
For instance, we could try the graph with vertex set $\mathbb{F}_p$ where the edge labelled 0
from vertex x goes to x + 1 and the edge labelled 1 goes to $x^{-1}$ (or 0 in case
x = 0). The undirected version of this graph is known to be an expander [Lub],
so we might hope that with k random steps we can reach an almost uniform
vertex even for $p = 2^{\Omega(k)}$ and thus output Ω(k) almost-uniform bits.

That is, we would walk on $\mathbb{F}_p$ with inverse chords rather than on an undirected cycle. It turns out that such
constructions do no better, as we now show by extending the above lower bound
for extractors for OBFSs to a large class of small-space streaming algorithms.
We start by defining the model of computation that we will assume. 

Definition 3.5 (Streaming Algorithm). A streaming algorithm $A: \{0,1\}^n \to \{0,1\}^m$
is given by a 5-tuple $(V, v_0, \Sigma^0, \Sigma^1, \varphi)$, where V is the state space, $v_0 \in V$
is the initial state, $\Sigma^0 = (\sigma^0_1, \ldots, \sigma^0_n)$ and $\Sigma^1 = (\sigma^1_1, \ldots, \sigma^1_n)$ are two sequences
of functions from V to itself, and φ is a function from V to $\{0,1\}^m$. On an
input sequence $(b_1, \ldots, b_n) \in \{0,1\}^n$, A computes by updating its state using
the rule $v_{i+1} = \sigma^{b_{i+1}}_{i+1}(v_i)$. A's output is $A(b_1, \ldots, b_n) = \varphi(v_n)$. The function φ is
called the output function of A, and the space of A is $\log |V|$.

We say that A is forgetless if and only if for every i at least one of either $\sigma^0_i$
or $\sigma^1_i$ is a permutation. (Thus, if the i-th bit is fixed to a certain value, A does
not "forget" anything about its state when reading that bit.)
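
Definition 3.5 translates directly into a small interpreter. The sketch below is illustrative only and makes some assumptions that are not in the text: states are the integers 0, ..., |V| − 1, each update function and the output function are stored as lookup tables (Python lists), and all function names are hypothetical.

```python
def run_streaming(v0, sigma0, sigma1, phi, bits):
    """Run a streaming algorithm whose update and output functions are lookup tables."""
    v = v0
    for i, b in enumerate(bits):
        v = (sigma1[i] if b else sigma0[i])[v]
    return phi[v]

def is_permutation(table):
    """A table on states 0..len-1 is a permutation iff every state is hit exactly once."""
    return sorted(table) == list(range(len(table)))

def is_forgetless(sigma0, sigma1):
    """Forgetless: at every step, at least one of the two updates is a permutation."""
    return all(is_permutation(s0) or is_permutation(s1)
               for s0, s1 in zip(sigma0, sigma1))

def cycle_walk(n, m):
    """The extractor of Theorem 3.1 as a streaming algorithm on V = Z/MZ."""
    M = 2**m
    stay = list(range(M))                   # reading a 0 leaves the state unchanged
    step = [(v + 1) % M for v in range(M)]  # reading a 1 adds 1 modulo M
    return 0, [stay] * n, [step] * n, list(range(M))
```

The walk of Theorem 3.1 is forgetless because both of its update functions are permutations. An algorithm that overwrites its state with the bit it just read (update tables `[0, 0]` and `[1, 1]`) is not.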

Forgetless streaming algorithms include random walks on 2-regular digraphs
that are consistently labelled (meaning that the edges labelled b form a
permutation, for each $b \in \{0,1\}$), like the graph on $\mathbb{F}_p$ mentioned above. However,
forgetless streaming algorithms are more general in the sense that they can
compute random walks in which each step of the walk is conducted on a different
graph.

We now show that forgetless streaming algorithms with small space cannot 
compute extractors for OBFSs with large output length (for small k). This is 
our main result. 

Theorem 3.6. Suppose that $f: \{0,1\}^n \to \{0,1\}^m$ is a deterministic ε-extractor
for (n, k)-OBFSs that can be computed by a forgetless streaming algorithm with
space $s \le \log(n/k)/k$. Then $m \le \log(k/(1 - \varepsilon))$.

Proof. Fix an ε-extractor for (n, k)-OBFSs $f: \{0,1\}^n \to \{0,1\}^m$ and let A be
a forgetless streaming algorithm with space $s \le \log(n/k)/k$ that computes f.
To show that $m \le \log(k/(1 - \varepsilon))$, we will first reduce to a special case in which
we can make some simplifying assumptions about A. We will then construct an
oblivious bit-fixing source X such that f is symmetric on the set of bit positions
not fixed by X. This will allow us to apply Lemma 3.3 to obtain our result since
f must map X close to uniform.

Reduction to the special case: Let $\Sigma^0$ and $\Sigma^1$ be the sequences of functions used
by A, and let φ be its output function. We reduce to the special case that every
element of $\Sigma^0$ is the identity.

Since A is forgetless, we can switch some of the functions $\sigma^0_i$ and $\sigma^1_i$ to make
every function in $\Sigma^0$ a permutation while preserving the fact that A computes
a (k, ε)-RF. (This corresponds to just negating some input bits.) This allows
us to define a new sequence of functions $F = \{f_1, \ldots, f_n\}$ and a new output
function ψ by the following relations.

$$\sigma^0_i \circ \cdots \circ \sigma^0_1 \circ f_i = \sigma^1_i \circ \sigma^0_{i-1} \circ \cdots \circ \sigma^0_1$$
$$\psi = \varphi \circ \sigma^0_n \circ \cdots \circ \sigma^0_1$$

Then $(V, v_0, (\mathrm{id}, \mathrm{id}, \ldots, \mathrm{id}), (f_1, \ldots, f_n), \psi)$ can be verified to be a streaming
algorithm that computes the same function as $(V, v_0, \Sigma^0, \Sigma^1, \varphi)$.

Constructing the source X: Letting $S = 2^s$, we can choose a set $F_1 \subseteq F$ of
size at least n/S such that all the functions in $F_1$ map the initial state $v_0$ to
some common state (call it $v_1$). We can then choose a set $F_2 \subseteq F_1$ of size at
least $n/S^2$ such that all functions in $F_2$ map $v_1$ to some common state, which
we call $v_2$. Continuing in this way, we obtain a set $F_k \subseteq F$ of size at least
$n/S^k$ and a sequence $(v_0, \ldots, v_k)$ with the property that every $f \in F_k$ satisfies
$f(v_i) = v_{i+1}$ for $0 \le i < k$. We now define X to be the oblivious bit-fixing
source that has the bits at positions that correspond to functions in $F_k$ unfixed
and the rest of the bits fixed to 0. By our assumption that $s \le \log(n/k)/k$, we
have $|F_k| \ge n/S^k \ge k$, meaning that X has at least k unfixed bits.

Obtaining the desired bound: For any string w in the support of X, f's output
will be $\psi(v_{H(w)})$ where H(w) is the Hamming weight of w. Therefore f is a
symmetric function of the bits in positions not fixed by X. Since X contains at
least k independent, uniformly random bits and f is a (k, ε)-resilient function,
Lemma 3.3 yields $m \le \log(k/(1 - \varepsilon))$ as desired. □
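
The two constructive steps of this proof can be traced on a small instance. The sketch below is not the authors' code: it assumes states are the integers 0, ..., |V| − 1, that functions are stored as lookup tables, and that every 0-labelled update is already a permutation (so the bit-negation step is skipped). All function names are hypothetical.

```python
from collections import defaultdict

def run_original(sigma0, sigma1, phi, v0, bits):
    """Run the streaming algorithm (V, v0, Sigma^0, Sigma^1, phi)."""
    v = v0
    for s0, s1, b in zip(sigma0, sigma1, bits):
        v = (s1 if b else s0)[v]
    return phi[v]

def normalize(sigma0, sigma1, phi):
    """Return (F, psi) such that every 0-step becomes the identity."""
    size = len(phi)
    prefix = list(range(size))              # P_{i-1} = sigma0_{i-1} o ... o sigma0_1
    F = []
    for s0, s1 in zip(sigma0, sigma1):
        nxt = [s0[v] for v in prefix]       # P_i = sigma0_i o P_{i-1}
        inverse = [0] * size
        for u, image in enumerate(nxt):
            inverse[image] = u              # P_i^{-1}, valid since P_i is a permutation
        F.append([inverse[s1[v]] for v in prefix])  # f_i = P_i^{-1} o sigma1_i o P_{i-1}
        prefix = nxt
    psi = [phi[v] for v in prefix]          # psi = phi o P_n
    return F, psi

def run_normalized(F, psi, v0, bits):
    """Run (V, v0, (id, ..., id), F, psi): only 1-bits change the state."""
    v = v0
    for f, b in zip(F, bits):
        if b:
            v = f[v]
    return psi[v]

def common_chain(F, v0, k):
    """Pigeonhole step: positions whose f_i all follow one path v_0, ..., v_k."""
    positions, path = list(range(len(F))), [v0]
    for _ in range(k):
        buckets = defaultdict(list)
        for i in positions:
            buckets[F[i][path[-1]]].append(i)
        nxt, positions = max(buckets.items(), key=lambda item: len(item[1]))
        path.append(nxt)
    return positions, path
```

Each round of `common_chain` keeps the largest bucket, so at least a 1/|V| fraction of the positions survives, which mirrors the bound $|F_k| \ge n/S^k$. On a source whose free positions all lie in the returned set, `run_normalized` outputs `psi[path[h]]`, where h is the Hamming weight of the free bits.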

What does this theorem tell us about extraction in low-entropy settings?
If we set s = m < k (as in the walk on the cycle of Theorem 3.1) then
Theorem 3.6 implies that when $k < \sqrt{\log n - \log\log n}$ we are confined to output
length $m \le \log(k/(1 - \varepsilon))$. In other words, the output length of Ω(log k) offered
by Theorem 3.1 is close to optimal for extractors in this model when $k < \sqrt{\log n}$.

We note here a separate, trivial space lower bound that applies even to the 
forgetful case: since streaming algorithms under our model cannot produce any 
output bits until they have read all the input bits, we have s > m − 1 when
ε < 1/2. This bound can in fact be generalized to streaming algorithms that are
allowed to output bits at any point in their computation by a simple adaptation
of a space lower bound for strong extractors proven in [BRST]. The resulting
lower bound says that s > m − 4 when ε < 1/8 and k < n/2 for extractors for
OBFSs computable by any streaming algorithm.

4 Non-constructive results 

We now turn to determining for what values of the entropy parameter k it is 
possible to achieve output length m = Ω(k) using the probabilistic method. Here
we find that the results are roughly in agreement with our explicit lower bounds
from the previous section. That is, a randomly chosen function $f: \{0,1\}^n \to \{0,1\}^m$
will almost always be an extractor for OBFSs with output length m = Ω(k)
when k is larger than log n, and this output length cannot be achieved
using the probabilistic method when k < log n.

We then show that random functions can do better in the more relaxed realm 
of exposure-resilient functions: a randomly chosen function is almost always a 
static ERF with optimal output length for any k, and an adaptive ERF with 
optimal output length when k is larger than log log n. 

Before we proceed, we state a Chernoff bound and a partial converse to it 
that we will use in proving these results. A sketch of the proof of Lemma 4.2 is 
given in the appendix. 

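Lemma 4.1 (A Chernoff bound). Let $X_1, \ldots, X_t$ be independent random variables
taking values in [0,1], and let $X = (\sum_i X_i)/t$ and $\mu = \mathrm{E}[X]$. Then for
every $0 < \varepsilon < 1$, we have

$$\Pr\left[ |X - \mu| \ge \varepsilon \right] \le 2e^{-t\varepsilon^2/2} \le 2^{-\lfloor \Omega(t\varepsilon^2) \rfloor}.$$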

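Lemma 4.2 (Partial converse of Chernoff bound). Let $X_1, \ldots, X_t$ represent the
results of independent, unbiased coin flips, and let $X = (\sum_i X_i)/t$. Then for
every $0 < \varepsilon < 1/2$, we have

$$\Pr\left[ \left| X - \frac{1}{2} \right| > \varepsilon \right] \ge 2^{-\lceil O(t\varepsilon^2) \rceil}.$$
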

4.1 Deterministic extractors for OBFSs 

Theorem 4.3 below, which follows from a straightforward application of the
Chernoff bound stated in Lemma 4.1, shows that the probabilistic method gives
extractors for OBFSs with k > log n. Theorem 4.4 then shows that k > log n is
the best we can do using the probabilistic method.

Theorem 4.3. For every $n \in \mathbb{N}$, $k \in [n]$, and $\varepsilon > 0$, a randomly chosen
function $f: \{0,1\}^n \to \{0,1\}^m$ with $m \le k - 2\log(1/\varepsilon) - O(1)$ and
$k \ge \max\{\log(n - k), \log\log\binom{n}{k}\} + 2\log(1/\varepsilon) + O(1)$ is a deterministic ε-extractor
for (n, k)-OBFSs with probability at least $1 - 2^{-\Omega(K\varepsilon^2)}$, where $K = 2^k$.

Proof. Fix an (n, k)-OBFS X. Choosing the function f consists of independently
assigning a string in $\{0,1\}^m$ to each string in the support of X. In order
for f to map X close to uniform, we need to have chosen it such that, for every
fixed statistical test $T \subseteq \{0,1\}^m$, the fraction of strings in X mapped by f into
T is very close to the density of T in $\{0,1\}^m$. This is expressed formally by the
condition below, where $f^{-1}(T)$ is taken within the support of X.

$$\left| \frac{|f^{-1}(T)|}{2^k} - \frac{|T|}{2^m} \right| < \varepsilon$$

Now fix one specific test $T \subseteq \{0,1\}^m$. For each string w in the support of X,
define the indicator variable $I_w$ to be 1 if $f(w) \in T$ and 0 otherwise. Then
Lemma 4.1 (our Chernoff bound) applied to $\sum_w I_w / 2^k = |f^{-1}(T)|/2^k$ shows
that f fails the condition above with probability at most $2^{-\lfloor \Omega(K\varepsilon^2) \rfloor}$.

There are $2^M$ possible tests $T \subseteq \{0,1\}^m$ (where $M = 2^m$). A union bound
over all these tests therefore gives that the probability that f fails to map X
to within ε of uniform is at most $2^{M - \lfloor \Omega(K\varepsilon^2) \rfloor}$. We can perform a similar union
bound over the possible choices of the source X: there are $\binom{n}{k} N/K$ such sources,
yielding that the probability that f is not a (k, ε)-RF is at most

$$\binom{n}{k} \frac{N}{K} \cdot 2^{M - \lfloor \Omega(K\varepsilon^2) \rfloor} = 2^{-\Omega(K\varepsilon^2)}$$

provided $K \ge \max\{\log\binom{n}{k}, \log(N/K)\} \cdot c/\varepsilon^2$ for a sufficiently large constant c and
$M \le c'K\varepsilon^2$ for a sufficiently small constant c'. Taking logarithms gives the
result. □

The $\max\{\log(n - k), \log\log\binom{n}{k}\}$ term in the statement of Theorem 4.3 is
always at most log n, so the theorem always holds when $k \ge \log n + 2\log(1/\varepsilon) + O(1)$,
as discussed earlier. In the following theorem, we prove a limitation on
the extraction properties of random functions which shows that this bound on 
k is in fact nearly tight. 

Theorem 4.4. There is a constant c such that for every $n \in \mathbb{N}$, $k \in [n]$,
and $\varepsilon \in [0, 1/2]$ satisfying $k \le \log(n - k) + 2\log(1/\varepsilon) - c$, a random function
$f: \{0,1\}^n \to \{0,1\}$ will fail to be a deterministic ε-extractor for (n, k)-OBFSs
with probability at least $1 - 2^{-\sqrt{N/K}}$, where $N = 2^n$ and $K = 2^k$.

Proof. Fix an input size n and a set L of n − k fixed bits (say, L = [n − k]). To
say that f is an ε-extractor for (n, k)-OBFSs is to say that all $2^{n-k}$ sets S of the
form $L^{a,n}$ satisfy the following condition.

$$\left| \Pr_{w \leftarrow S}[f(w) = 1] - \frac{1}{2} \right| \le \varepsilon$$

Since f(w) is chosen independently for each string $w \in S$, we can use the
converse of our Chernoff bound (Lemma 4.2) to say that the probability that
f satisfies this condition for a fixed set S is at most $1 - 2^{-\lceil O(K\varepsilon^2) \rceil}$, where
$K = 2^k = |S|$.

Since there are N/K subsets of the form $L^{a,n}$ and they are disjoint, the probability
that f will fail the above condition on none of them (i.e. the probability
that f is a resilient function) is at most

$$\left(1 - 2^{-\lceil O(K\varepsilon^2) \rceil}\right)^{N/K}.$$

If the $O(K\varepsilon^2)$ term is less than or equal to 1, this probability is at most $2^{-N/K}$.
Otherwise, it is at most $2^{-\sqrt{N/K}}$ provided that $N/K \ge 2^{CK\varepsilon^2}$ for a sufficiently
large constant $C = 2^c$. Taking logarithms twice completes the proof. □
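
The simplest case of this argument, k = 1 with one output bit, can be checked by exhaustive search. The sketch below rests on assumptions not in the text: inputs are encoded as integers, f is stored as a truth table, and the seeded random table is just one hypothetical sample. For k = 1 and any ε < 1/2, f fails on the source with free position i as soon as some pair of inputs differing only in bit i receives the same value.

```python
import random

def constant_pair_exists(f_table, n, free_bit):
    """Is there an (n, 1)-OBFS with the given free position on which f is constant?"""
    flip = 1 << free_bit
    return any(f_table[w] == f_table[w ^ flip]
               for w in range(2**n) if not w & flip)

n = 10
rng = random.Random(1)  # fixed seed, purely for reproducibility of the sample
random_table = [rng.randrange(2) for _ in range(2**n)]
parity_table = [bin(w).count("1") % 2 for w in range(2**n)]
```

For each free position there are 2^(n−1) disjoint pairs, and a random table separates all of them with probability 2^(−2^(n−1)), so `constant_pair_exists(random_table, n, i)` holds for every i except with negligible probability. The parity table has no such pair at any position.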

Theorem 4.4 does not establish that extractors for OBFSs with the stated 
parameters do not exist; indeed, as mentioned earlier, the parity function (i.e. 
$f(x_1, \ldots, x_n) = \bigoplus_i x_i$) is a perfect resilient function for even k = 1. What the
theorem does show, however, is that k ≈ log n represents a critical point below
which these extractors become very rare. This seems consistent with the lower 
bound on k proven in Theorem 3.6. 



4.2 Exposure-resilient functions 

We now show that probabilistically constructing exposure-resilient functions 
is easier than constructing extractors for OBFSs. This is because, while the 
adversary can choose input sources in the extractor setting, here it can only 
expose them. The probabilistic constructions of static and adaptive ERFs both 
proceed by counting the number of adversaries that must be fooled and then 
applying Lemma 4.5 (below), which is an upper bound on the probability that a 
randomly chosen function will fail to fool a fixed adversary. This lemma applies 
equally both to static and adaptive adversaries; the difference in achievable 
parameters between static and adaptive ERFs therefore stems solely from the 
fact that there are many more adversaries in the adaptive setting. 

Lemma 4.5. Let $A: \{0,1\}^n \to \{0,1\}^*$ be an algorithm that reads at most d bits
of its input, let $\varepsilon > 0$, and choose a function $f: \{0,1\}^n \to \{0,1\}^m$ uniformly at
random with $m = n - d - 2\log(1/\varepsilon) - O(1)$. Then f will fail to satisfy

$$(A(U_n), f(U_n)) \approx_\varepsilon (A(U_n), U_m)$$

with probability at most $2^{-\Omega(N\varepsilon^2)}$, where $N = 2^n$.

Proof. Lemma 2.9 allows us to assume without loss of generality that A
adaptively reads d bits and outputs them in the order that they were read. Under
this assumption, we have $(A(U_n), U_m) = U_{d+m}$. We therefore need only to
bound the probability that $(A(U_n), f(U_n))$ is far from $U_{d+m}$.

Fix a statistical test $T \subseteq \{0,1\}^d \times \{0,1\}^m$. In order for $(A(U_n), f(U_n))$ to
pass this specific test of uniformity, we need f to satisfy

$$\left| \Pr[(A(U_n), f(U_n)) \in T] - \frac{|T|}{2^{d+m}} \right| < \varepsilon \qquad (4.1)$$

For every $w \in \{0,1\}^n$, define $I_w$ to be 1 if $(A(w), f(w)) \in T$ and 0 otherwise,
and notice that $\Pr[(A(U_n), f(U_n)) \in T] = \frac{1}{N}\sum_w I_w$. For $x \in \{0,1\}^d$, let $T_x$
denote $T \cap (\{x\} \times \{0,1\}^m)$. Then, for a fixed w, the expectation of $I_w$ over the
choice of f is exactly $|T_{A(w)}|/2^m$, and so by the regularity of A the expectation
of $\frac{1}{N}\sum_w I_w$ over the choice of f is $|T|/2^{d+m}$. A Chernoff bound (Lemma 4.1)
then gives that the probability over the choice of f that Equation (4.1) is not
satisfied is at most $2^{-\lfloor \Omega(N\varepsilon^2) \rfloor}$.

Since there are $2^{DM}$ possible choices of $T$ in the above analysis (where $D = 2^d$,
$M = 2^m$), a union bound shows that the probability that $(A(U_n), f(U_n))$
will fail one or more of them is at most $2^{DM} \cdot 2^{-\lfloor \Omega(N\varepsilon^2) \rfloor} = 2^{-\Omega(N\varepsilon^2)}$ if
$m = n - d - 2\log(1/\varepsilon) - c$ for a sufficiently large constant $c$. □
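
The quantity controlled by Lemma 4.5 can be computed exactly for small parameters. The sketch below is only an illustration under stated assumptions: `adaptive_reader` is a hypothetical toy adversary with $d = 2$ that outputs the bits it reads in the order read, `random_function` draws $f$ as a lookup table, and `distance_from_uniform` computes the statistical distance between $(A(U_n), f(U_n))$ and $U_{d+m}$ by enumerating all $2^n$ inputs.

```python
import random
from collections import Counter


def adaptive_reader(w):
    """Toy adaptive adversary: read bit 0, then bit 1 or bit 2 depending on it.

    Returns the d = 2 bits it read, packed in the order they were read.
    """
    first = w & 1
    second_index = 1 if first == 0 else 2
    second = (w >> second_index) & 1
    return (first << 1) | second


def random_function(n, m, rng):
    """Uniformly random f: {0,1}^n -> {0,1}^m, stored as a table of 2**n outputs."""
    return [rng.randrange(2 ** m) for _ in range(2 ** n)]


def distance_from_uniform(n, d, m, adversary, f):
    """Exact statistical distance between (A(U_n), f(U_n)) and U_{d+m}."""
    total, cells = 2 ** n, 2 ** (d + m)
    counts = Counter((adversary(w), f[w]) for w in range(total))
    seen = sum(abs(c / total - 1 / cells) for c in counts.values())
    unseen = (cells - len(counts)) / cells  # cells that are never hit
    return 0.5 * (seen + unseen)


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, m = 14, 2, 2
    f = random_function(n, m, rng)
    print(distance_from_uniform(n, d, m, adaptive_reader, f))
```

The toy adversary is regular in the sense used in the proof: each of its four possible outputs arises from exactly $2^{n-2}$ inputs, which is why the comparison distribution is uniform on $d + m$ bits.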

Having established that a random function will tend to fool a fixed adver- 
sary, we now establish the existence of static and adaptive exposure-resilient 
functions. In both cases, we do so by taking a union bound over all potential 
adversaries and applying Lemma 4.5. Thus, the parameters achieved are those 
that bring the number of adversaries to below $2^{N\varepsilon^2}$.

Theorem 4.6. For every $n \in \mathbb{N}$, $k \in [n]$, and $\varepsilon \ge c\sqrt{n/2^n}$ where $c$ is a
universal constant, a randomly chosen function $f\colon \{0,1\}^n \to \{0,1\}^m$ with
$m \le k - 2\log(1/\varepsilon) - O(1)$ is a static $(k, \varepsilon)$-ERF with probability at least
$1 - 2^{-\Omega(N\varepsilon^2)}$, where $N = 2^n$.

Proof. Every static adversary that tries to distinguish the output of $f$ from
uniform is an algorithm $A\colon \{0,1\}^n \to \{0,1\}^{n-k}$ that reads exactly $n - k$ bits of
its input. We can therefore apply Lemma 4.5 with $d = n - k$ to get that the
probability that $f$ will fail to fool any one adversary is at most $2^{-\Omega(N\varepsilon^2)}$. Taking
a union bound over the $\binom{n}{k}$ possible adversaries, we get that the probability that
$f$ will not fool all adversaries is at most

$$\binom{n}{k} \cdot 2^{-\Omega(N\varepsilon^2)} \le 2^n \cdot 2^{-\Omega(N\varepsilon^2)} = 2^{-\Omega(N\varepsilon^2)},$$

where the final equality is given by the constraint on $\varepsilon$. □
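
For very small $n$, the static ERF property of a sampled function can be checked directly by enumerating every set of $n - k$ exposed positions. This is a minimal sketch, not an efficient test: the names `exposed_view` and `static_erf_error` are hypothetical, and $f$ is assumed to be given as a table of $2^n$ outputs.

```python
import random
from itertools import combinations


def exposed_view(w, positions):
    """Pack the bits of w at the exposed positions into one integer."""
    view = 0
    for i in positions:
        view = (view << 1) | ((w >> i) & 1)
    return view


def static_erf_error(n, k, m, f):
    """Worst statistical distance between (exposed bits, f(U_n)) and
    (exposed bits, U_m), over all choices of n - k exposed positions.

    f is a table of 2**n outputs, each in range(2**m).
    """
    total, cells = 2 ** n, 2 ** (n - k + m)
    worst = 0.0
    for positions in combinations(range(n), n - k):
        counts = [0] * cells
        for w in range(total):
            counts[(exposed_view(w, positions) << m) | f[w]] += 1
        distance = 0.5 * sum(abs(c / total - 1 / cells) for c in counts)
        worst = max(worst, distance)
    return worst


if __name__ == "__main__":
    rng = random.Random(1)
    n, k, m = 10, 8, 1
    f = [rng.randrange(2 ** m) for _ in range(2 ** n)]
    print(static_erf_error(n, k, m, f))
```

By contrast, the function that outputs the first input bit has error exactly $1/2$ under this measure, since the adversary may expose that bit.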

Counting the number of adversaries in the adaptive setting is a bit more 
work, but Lemma 2.9 from our preliminaries simplifies this task. 

Theorem 4.7. For every $n \in \mathbb{N}$, $k \in [n]$, and $\varepsilon > 0$, a randomly chosen
function $f\colon \{0,1\}^n \to \{0,1\}^m$ with $m \le k - 2\log(1/\varepsilon) - O(1)$ and
$k \ge \log\log n + 2\log(1/\varepsilon) + O(1)$ is an adaptive $(k, \varepsilon)$-ERF with probability at least
$1 - 2^{-\Omega(N\varepsilon^2)}$, where $N = 2^n$.

Proof. The proof is identical to that of Theorem 4.6 except that we have to 
count the number of adaptive adversaries. We do so below. 

First we note that Lemma 2.9 implies that if / fools all adaptive adversaries 
that output the bits they read as they read them, then / fools all adaptive 
adversaries. We therefore only need to count this smaller set of adversaries. 
The process by which such an adversary chooses which bits to request can be 
modelled by a decision tree of depth $n - k - 1$ whose internal nodes are labelled by
elements of $[n]$. Since the number of nodes in such a tree is $2^{n-k-1} - 1 < N/2K$,
where $N = 2^n$ and $K = 2^k$, we can bound the total number of trees, and
therefore adversaries, by $n^{N/2K}$.

Proceeding with the same kind of union bound as in the proof of Theorem 4.6,
we see that the probability that $f$ will not fool all adaptive adversaries
is at most $n^{N/2K} \cdot 2^{-\Omega(N\varepsilon^2)} = 2^{-\Omega(N\varepsilon^2)}$, provided that $K \ge (c \log n)/\varepsilon^2$ for a
sufficiently large constant $c$. Taking logarithms yields the theorem. □
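
The last two steps of this proof can be spelled out as follows. This is a worked restatement of the argument above, with logarithms taken base 2 and $c$ the constant from the proof; it adds no new assumption beyond treating the hidden constant in $\Omega(N\varepsilon^2)$ as fixed.

```latex
% The adversary count n^{N/2K} equals 2^{(N \log n)/(2K)}.
\[
  K \ge \frac{c \log n}{\varepsilon^2}
  \quad\Longrightarrow\quad
  \frac{N \log n}{2K} \le \frac{N \varepsilon^2}{2c},
\]
% so for c large enough the count is absorbed by the 2^{-\Omega(N\varepsilon^2)} failure probability.
% Substituting K = 2^k and taking logarithms gives the condition on k:
\[
  2^k \ge \frac{c \log n}{\varepsilon^2}
  \quad\Longleftrightarrow\quad
  k \ge \log\log n + 2\log(1/\varepsilon) + \log c .
\]
```

Since $\log c = O(1)$, this is the lower bound on $k$ stated in Theorem 4.7.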






5 Future work 



The general question of whether there exist resilient functions with large output 
length in the low-entropy range studied here is still unresolved. 

Open Question 1. Does there exist, for all $n \in \mathbb{N}$ and some growing function
$0 < k(n) < \log n$, a deterministic $\varepsilon$-extractor for $(n, k(n))$-OBFSs with output
length $m = \Omega(k(n))$ and $\varepsilon$ constant?

Theorem 3.6 shows that resolving this question in the positive direction
requires a function that is either not computable by a forgetless streaming
algorithm or uses a considerable amount of space. In the other direction, an
interesting step towards a negative result would be to at least remove the
forgetlessness condition from the space lower bound proven in that theorem.

We can ask an analogous question for the case of adaptive ERFs with k < 
log log n. 

Open Question 2. Does there exist, for all $n \in \mathbb{N}$ and some growing function
$0 < k(n) < \log\log n$, an adaptive $(k(n), \varepsilon)$-ERF with output length
$m = \Omega(k(n))$ and $\varepsilon$ constant?

In this case, we cannot even rule out the possibility that a more clever use 
of the probabilistic method will resolve this question positively. Thus, a first 
step toward a negative result might be to prove an analogue to Theorem 4.4 
that shows that adaptive ERFs with near-optimal output length become very 
rare when k < log log n. 

A third open problem arising from this work is that of finding an explicit con- 
struction of a static ERF with the parameters achieved using the probabilistic 
method in Theorem 4.6. Currently, an output length of $\Omega(k)$ is achieved in [DSS]
using strong extractors, but the construction works only when $k > \log n$. For $k$
smaller than log n, there is no known construction of a static ERF that is not 
also an RF, making the construction of Theorem 3.6 the current state of the 
art. This leaves us with the following open question: 

Open Question 3. Does there exist, for all $n \in \mathbb{N}$ and some growing function
$0 < k(n) < \log n$, an explicit static $(k(n), \varepsilon)$-ERF with output length
$m = \Omega(k(n))$ and $\varepsilon$ constant?
A Proof sketch of Lemma 4.2



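Lemma. Let $X_1, \ldots, X_t$ represent the results of independent, unbiased coin
flips, and let $X = (\sum_i X_i)/t$. Then for every $0 < \varepsilon < 1/2$, we have

$$\Pr\left[\left|X - \tfrac{1}{2}\right| \ge \varepsilon\right] \ge 2^{-\lceil O(t\varepsilon^2) \rceil}.$$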



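Proof Sketch. We address three separate cases according to the size of $\varepsilon$: $\varepsilon$ below
a threshold on the order of $1/\sqrt{t}$, $\varepsilon$ between that threshold and a constant, and
$\varepsilon$ between that constant and $1/2$. In the first case, we upper-bound the
probability that $|X - \frac{1}{2}| < \varepsilon$ using the fact that no term of the binomial
distribution exceeds $\sqrt{2/(\pi t)}$ in probability mass. In the second case, we set
$\beta = \frac{1}{2} + 2\varepsilon$ and use Stirling's approximation to lower-bound the probability by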



$$\lceil \varepsilon t \rceil \cdot \binom{t}{\lceil \beta t \rceil} \Big/ 2^t \;\ge\; \cdots,$$

where the first inequality is from Stirling's approximation. In the third case, we
just lower-bound the probability by $2^{-t}$. □
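
The two elementary facts used in this sketch can be checked exactly for small $t$. The code below is an illustration only: `max_binomial_term` and `deviation_probability` are hypothetical helper names, and $\varepsilon$ is passed as a `Fraction` so that the boundary case $|X - \frac{1}{2}| = \varepsilon$ is decided without floating-point error.

```python
from fractions import Fraction
from math import comb, pi, sqrt


def max_binomial_term(t):
    """Largest probability mass of Binomial(t, 1/2), attained at floor(t/2)."""
    return comb(t, t // 2) / 2 ** t


def deviation_probability(t, eps):
    """Exact Pr[|X - 1/2| >= eps] for X the mean of t fair coin flips.

    eps should be a Fraction (or int) so the comparison is exact.
    """
    half = Fraction(1, 2)
    hits = sum(comb(t, s) for s in range(t + 1)
               if abs(Fraction(s, t) - half) >= eps)
    return hits / 2 ** t


if __name__ == "__main__":
    t = 100
    print(max_binomial_term(t), sqrt(2 / (pi * t)))
    print(deviation_probability(t, Fraction(1, 10)))
```

For $\varepsilon$ close to $1/2$ only the all-zeros and all-ones outcomes can qualify, which is the source of the $2^{-t}$ bound in the third case.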



B Proof of Claim 3.2 

Claim. Let $W_k$ be the distribution on the vertices of $\mathbb{Z}/M\mathbb{Z}$ (where $M = 2^m$)
obtained by beginning at 0 and adding 1 or 0 with equal probability $k$ times. The
distance from uniform of $W_k$ is at most

$$\frac{e^{-k\pi^2/2M^2}}{2\left(1 - e^{-3k\pi^2/2M^2}\right)}.$$

Proof. Consider $\mathbb{Z}/M\mathbb{Z}$ as an additive group, and let $P$ be the probability
distribution on $\mathbb{Z}/M\mathbb{Z}$ that equals 0 with probability 1/2 and 1 otherwise. Then
the distribution on $\mathbb{Z}/M\mathbb{Z}$ after $k$ steps of our random walk is $P^{*k}$, the $k$-th
convolution of $P$ with itself.

Lemma 1 in Chapter 3 of [Dia] bounds the distance between $P^{*k}$ and the
uniform distribution in terms of the traces of the Fourier transforms by $P^{*k}$
of the non-trivial irreducible representations of $\mathbb{Z}/M\mathbb{Z}$. This simplifies nicely
since the Fourier transform $\widehat{P^{*k}}(\rho)$ of a representation $\rho$ by $P^{*k}$ equals $(\hat{P}(\rho))^k$,
the $k$-th power of the Fourier transform of $\rho$ by $P$. Since there is one
non-trivial irreducible representation for each $j \in [M - 1]$, we therefore arrive at the
following upper bound for the distance from uniform after $k$ random steps.

$$\frac{1}{4} \sum_{j=1}^{M-1} \left(\frac{1}{2} + \frac{1}{2}\cos\frac{2\pi j}{M}\right)^k$$

To bound this sum, we first note that $\frac{1}{2} + \frac{1}{2}\cos(x) \le e^{-x^2/8}$ for $x \in [0, \pi]$.
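This, together with the fact that $M = 2^m$ is even, allows us to write

$$\begin{aligned}
\frac{1}{4} \sum_{j=1}^{M-1} \left(\frac{1}{2} + \frac{1}{2}\cos\frac{2\pi j}{M}\right)^k
&= \frac{1}{2} \sum_{j=1}^{(M-2)/2} \left(\frac{1}{2} + \frac{1}{2}\cos\frac{2\pi j}{M}\right)^k \\
&\le \frac{1}{2} \sum_{j=1}^{\infty} e^{-k\pi^2 j^2/2M^2} \\
&= \frac{1}{2}\, e^{-k\pi^2/2M^2} \sum_{j=1}^{\infty} e^{-k\pi^2 (j^2 - 1)/2M^2} \\
&\le \frac{1}{2}\, e^{-k\pi^2/2M^2} \sum_{j=0}^{\infty} e^{-3jk\pi^2/2M^2} \\
&= \frac{e^{-k\pi^2/2M^2}}{2\left(1 - e^{-3k\pi^2/2M^2}\right)},
\end{aligned}$$

which is the desired result. □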
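
The chain of inequalities above can be sanity-checked numerically. The sketch below is illustrative only and uses hypothetical helper names: `fourier_sum` evaluates the cosine sum that the proof starts from, and `closed_form_bound` evaluates the final expression in the Claim. It takes the modulus $M$ directly as an argument rather than $m$.

```python
from math import cos, exp, pi


def fourier_sum(k, M):
    """(1/4) * sum over j = 1..M-1 of (1/2 + 1/2 cos(2 pi j / M))^k."""
    return 0.25 * sum((0.5 + 0.5 * cos(2 * pi * j / M)) ** k
                      for j in range(1, M))


def closed_form_bound(k, M):
    """e^{-k pi^2 / 2M^2} / (2 (1 - e^{-3 k pi^2 / 2M^2})), for k >= 1."""
    decay = exp(-k * pi ** 2 / (2 * M ** 2))
    return decay / (2 * (1 - exp(-3 * k * pi ** 2 / (2 * M ** 2))))


if __name__ == "__main__":
    for M in (4, 8, 16):
        for k in (M, M * M, 4 * M * M):
            print(M, k, fourier_sum(k, M), closed_form_bound(k, M))
```

Both quantities decay at rate roughly $e^{-k\pi^2/2M^2}$, so the bound only becomes small once $k$ is on the order of $M^2$.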